Music-Driven Group Choreography
Music-driven choreography is a challenging problem with a wide variety of
industrial applications. Recently, many methods have been proposed to
synthesize dance motions from music for a single dancer. However, generating
dance motion for a group remains an open problem. In this paper, we present
AIOZ-GDANCE, a new large-scale dataset for music-driven group dance
generation. Unlike existing datasets that only support single dance, our new
dataset contains group dance videos, hence supporting the study of group
choreography. We propose a semi-autonomous labeling method with humans in the
loop to obtain the 3D ground truth for our dataset. The proposed dataset
consists of 16.7 hours of paired music and 3D motion from in-the-wild videos,
covering 7 dance styles and 16 music genres. We show that naively applying a
single-dancer generation technique to create group dance motion may lead to
unsatisfactory results, such as inconsistent movements and collisions between
dancers. Based on our new dataset, we propose a new method that takes an input
music sequence and a set of 3D positions of dancers to efficiently produce
multiple group-coherent choreographies. We propose new evaluation metrics for
measuring group dance quality and perform intensive experiments to demonstrate
the effectiveness of our method. Our project facilitates future research on
group dance generation and is available at:
https://aioz-ai.github.io/AIOZ-GDANCE/
Comment: accepted in CVPR 2023
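The group-quality problems noted above (e.g., collisions between dancers) can be made concrete with a simple per-frame check. The sketch below is our illustration in the spirit of the paper's group-dance metrics; the function name, trajectory layout, and threshold are assumptions, not the paper's exact evaluation metric.

```python
import math
from itertools import combinations

def collision_rate(motions, threshold=0.2):
    """Fraction of frames in which any two dancers' root positions come
    closer than `threshold`. A hypothetical group-quality check, not the
    paper's exact metric.

    motions: list of per-dancer trajectories, each a list of (x, y, z)
    root positions, one per frame (all dancers share the frame count).
    """
    num_frames = len(motions[0])
    colliding = 0
    for f in range(num_frames):
        frame = [traj[f] for traj in motions]
        # flag the frame if any pair of dancers is too close
        if any(math.dist(p, q) < threshold
               for p, q in combinations(frame, 2)):
            colliding += 1
    return colliding / num_frames
```

A naive per-dancer generator scores poorly on such a check precisely because each dancer's motion is synthesized without awareness of the others.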
Addressing Non-IID Problem in Federated Autonomous Driving with Contrastive Divergence Loss
Federated learning has been widely applied in autonomous driving since it
enables training a learning model among vehicles without sharing users' data.
However, data from autonomous vehicles usually suffer from the
non-independent-and-identically-distributed (non-IID) problem, which may cause
negative effects on the convergence of the learning process. In this paper, we
propose a new contrastive divergence loss to address the non-IID problem in
autonomous driving by reducing the impact of divergence factors from
transmitted models during the local learning process of each silo. We also
analyze the effects of contrastive divergence in various autonomous driving
scenarios, under multiple network infrastructures, and with different
centralized/distributed learning schemes. Our intensive experiments on three
datasets demonstrate that our proposed contrastive divergence loss further
improves the performance over current state-of-the-art approaches.
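The core idea of reducing the impact of divergent transmitted models during local training can be caricatured with a triplet-style penalty. This is only our toy rendering of the concept, not the paper's formulation; the weight-vector inputs, the margin, and all names are assumptions.

```python
def contrastive_divergence_loss(local_w, global_w, neighbor_ws, margin=1.0):
    """Toy triplet-style sketch of a contrastive divergence penalty
    (illustrative framing only, not the paper's loss): keep the local
    silo's weights close to the transmitted global model while staying
    separated from divergent neighbor models.

    local_w, global_w: flat lists of weights; neighbor_ws: list of such
    lists received from other silos.
    """
    sq_dist = lambda a, b: sum((x - y) ** 2 for x, y in zip(a, b))
    pos = sq_dist(local_w, global_w)                      # agreement with global model
    neg = min(sq_dist(local_w, w) for w in neighbor_ws)   # closest divergent model
    return max(0.0, pos - neg + margin)                   # hinge keeps the loss non-negative
```

The hinge goes to zero once the local model sits closer to the global model than to any divergent neighbor by the margin.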
Reducing Training Time in Cross-Silo Federated Learning using Multigraph Topology
Federated learning is an active research topic since it enables several
participants to jointly train a model without sharing local data. Currently,
cross-silo federated learning is a popular training setting that utilizes a few
hundred reliable data silos with high-speed access links to train a model.
While this approach has been widely applied in real-world scenarios, designing
a robust topology to reduce the training time remains an open problem. In this
paper, we present a new multigraph topology for cross-silo federated learning.
We first construct the multigraph using the overlay graph. We then parse this
multigraph into different simple graphs with isolated nodes. The existence of
isolated nodes allows us to perform model aggregation without waiting for other
nodes, hence effectively reducing the training time. Intensive experiments on
three public datasets show that our proposed method significantly reduces the
training time compared with recent state-of-the-art topologies while
maintaining the accuracy of the learned model. Our code can be found at
https://github.com/aioz-ai/MultigraphFL
Comment: accepted in ICCV 2023
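The multigraph-to-simple-graphs step can be sketched in a few lines: repeatedly take one copy of each remaining (possibly parallel) edge to form a simple graph, and mark nodes that receive no edge in that round as isolated. This is our illustrative reading of the idea, not necessarily the paper's exact parsing procedure.

```python
from collections import Counter

def parse_multigraph(nodes, multi_edges):
    """Peel a multigraph (parallel edges allowed) into a sequence of
    simple graphs; nodes untouched in a round are 'isolated' and could
    aggregate without waiting for others. Illustrative sketch only.

    multi_edges: list of (u, v) pairs, possibly repeated.
    """
    counts = Counter(frozenset(e) for e in multi_edges)
    rounds = []
    while counts:
        # one copy of every remaining edge forms one simple graph
        simple = sorted(tuple(sorted(e)) for e in counts)
        rounds.append({
            "edges": simple,
            "isolated": sorted(n for n in nodes
                               if not any(n in e for e in simple)),
        })
        # remove the used copy of each edge; Counter drops zero counts
        counts -= Counter(frozenset(e) for e in simple)
    return rounds
```

For example, with two parallel links between silos 0 and 1 and one link between 1 and 2, the second round leaves silo 2 isolated, so it can aggregate immediately.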
Deep Metric Learning Meets Deep Clustering: A Novel Unsupervised Approach for Feature Embedding
Unsupervised Deep Distance Metric Learning (UDML) aims to learn sample
similarities in the embedding space from an unlabeled dataset. Traditional UDML
methods usually use triplet or pairwise losses, which require mining
positive and negative samples w.r.t. anchor data points. This is, however,
challenging in an unsupervised setting as the label information is not
available. In this paper, we propose a new UDML method that overcomes that
challenge. In particular, we propose to use a deep clustering loss to learn
centroids, i.e., pseudo labels, that represent semantic classes. During
learning, these centroids are also used to reconstruct the input samples. It
hence ensures the representativeness of centroids - each centroid represents
visually similar samples. Therefore, the centroids give information about
positive (visually similar) and negative (visually dissimilar) samples. Based
on pseudo labels, we propose a novel unsupervised metric loss which enforces
the positive concentration and negative separation of samples in the embedding
space. Experimental results on benchmarking datasets show that the proposed
approach outperforms other UDML methods.
Comment: Accepted in BMVC 2020
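The "positive concentration and negative separation" objective over pseudo labels can be sketched as follows. This is our simplification: each sample's nearest centroid serves as its pseudo label, the loss pulls the sample toward that centroid and pushes every other centroid at least a margin away. The function name and margin are assumptions, not the paper's exact objective.

```python
import math

def pseudo_label_metric_loss(embeddings, centroids, margin=1.0):
    """Sketch of an unsupervised metric loss over clustering pseudo
    labels (illustrative simplification, not the paper's exact loss).

    embeddings: list of points; centroids: list of cluster centers
    (at least two), both as coordinate tuples.
    """
    total = 0.0
    for x in embeddings:
        dists = [math.dist(x, c) for c in centroids]
        label = dists.index(min(dists))          # pseudo label = nearest centroid
        pos = dists[label]                       # positive concentration term
        neg = min(d for i, d in enumerate(dists) if i != label)
        total += pos + max(0.0, margin - neg)    # negative separation (hinge)
    return total / len(embeddings)
```

When every sample sits on its own centroid and centroids are farther apart than the margin, the loss vanishes.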
Controllable Group Choreography Using Contrastive Diffusion
Music-driven group choreography poses a considerable challenge but holds significant potential for a wide range of industrial applications. The ability to generate synchronized and visually appealing group dance motions that are aligned with music opens up opportunities in many fields such as entertainment, advertising, and virtual performances. However, most recent works are unable to generate high-fidelity long-term motions or fail to enable a controllable experience. In this work, we aim to address the demand for high-quality and customizable group dance generation by effectively governing the consistency and diversity of group choreographies. In particular, we utilize a diffusion-based generative approach to enable the synthesis of a flexible number of dancers and long-term group dances, while ensuring coherence with the input music. Ultimately, we introduce a Group Contrastive Diffusion (GCD) strategy to enhance the connection between dancers and their group, providing the ability to control the consistency or diversity level of the synthesized group animation via the classifier-guidance sampling technique. Through intensive experiments and evaluation, we demonstrate the effectiveness of our approach in producing visually captivating and consistent group dance motions. The experimental results show the capability of our method to achieve the desired levels of consistency and diversity while maintaining the overall quality of the generated group choreography.
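Classifier-guidance sampling, which the abstract names as the control mechanism, generally mixes an unconditional estimate with a conditioned one using a guidance weight. The sketch below shows only that generic mixing step; GCD's exact sampling equations, tensor shapes, and the name `guided_noise_estimate` are our assumptions.

```python
def guided_noise_estimate(uncond, group_cond, gamma):
    """Generic classifier-guidance-style mixing (illustrative only, not
    GCD's exact equations): interpolate/extrapolate between an
    unconditional noise estimate and a group-consistency-conditioned
    one. Larger gamma steers sampling toward group consistency; smaller
    (or negative) gamma favors diversity.
    """
    return [u + gamma * (c - u) for u, c in zip(uncond, group_cond)]
```

At gamma = 0 the sampler ignores the group condition entirely; at gamma = 1 it follows the conditioned estimate; values beyond 1 extrapolate for stronger consistency.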
Overcoming Data Limitation in Medical Visual Question Answering
Traditional approaches for Visual Question Answering (VQA) require large amounts of labeled data for training. Unfortunately, such large-scale data is usually not available for the medical domain. In this paper, we propose a novel medical VQA framework that overcomes the labeled-data limitation. The proposed framework explores the use of an unsupervised Denoising Auto-Encoder (DAE) and supervised Meta-Learning. The advantage of the DAE is to leverage the large amount of unlabeled images, while the advantage of Meta-Learning is to learn meta-weights that quickly adapt to the VQA problem with limited labeled data. Leveraging these advantages allows the proposed framework to be efficiently trained using a small labeled training set. The experimental results show that our proposed method significantly outperforms state-of-the-art medical VQA methods. The source code is available at https://github.com/aioz-ai/MICCAI19-MedVQA
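The DAE objective that lets unlabeled images contribute can be shown with a deliberately tiny scalar example: corrupt the input, encode/decode it, and score reconstruction against the clean input. The paper's DAE operates on medical images with a convolutional encoder; the one-weight model and all names below are our toy assumptions.

```python
import random

def dae_loss(xs, w_enc, w_dec, noise_std=0.1, seed=0):
    """Toy scalar denoising auto-encoder objective (illustration of the
    DAE pretraining idea only, not the paper's network): corrupt each
    input with Gaussian noise, encode then decode with single weights,
    and measure reconstruction of the *clean* input.
    """
    rng = random.Random(seed)
    total = 0.0
    for x in xs:
        x_noisy = x + rng.gauss(0.0, noise_std)   # corruption step
        x_hat = (x_noisy * w_enc) * w_dec         # encode, then decode
        total += (x_hat - x) ** 2                 # reconstruct the clean input
    return total / len(xs)
```

Because the target is the clean input, minimizing this loss requires no labels at all, which is exactly what makes the large unlabeled image pool usable before the meta-learned VQA head is trained.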